Goto

Collaborating Authors

 mutual learning


Principled Long-Tailed Generative Modeling via Diffusion Models

Neural Information Processing Systems

Deep generative models, particularly diffusion models, have achieved remarkable success but face significant challenges when trained on real-world, long-tailed datasets-where few "head" classes dominate and many "tail" classes are underrepresented. This paper develops a theoretical framework for long-tailed learning via diffusion models through the lens of deep mutual learning. We introduce a novel regularized training objective that combines the standard diffusion loss with a mutual learning term, enabling balanced performance across all class labels, including the underrepresented tails. Our approach to learn via the proposed regularized objective is to formulate it as a multi-player game, with Nash equilibrium serving as the solution concept. We derive a non-asymptotic first-order convergence result for individual gradient descent algorithm to find the Nash equilibrium.




Model and Feature Diversity for Bayesian Neural Networks in Mutual Learning

Neural Information Processing Systems

However, they often underperform compared to deterministic neural networks. Utilizing mutual learning can effectively enhance the performance of peer BNNs. In this paper, we propose a novel approach to improve BNNs performance through deep mutual learning. The proposed approaches aim to increase diversity in both network parameter distributions and feature distributions, promoting peer networks to acquire distinct features that capture different characteristics of the input, which enhances the effectiveness of mutual learning. Experimental results demonstrate significant improvements in the classification accuracy, negative log-likelihood, and expected calibration error when compared to traditional mutual learning for BNNs.


Socially Aware Music Recommendation: A Multi-Modal Graph Neural Networks for Collaborative Music Consumption and Community-Based Engagement

arXiv.org Artificial Intelligence

This study presents a novel Multi-Modal Graph Neural Network (MM-GNN) framework for socially aware music recommendation, designed to enhance personalization and foster community-based engagement. The proposed model introduces a fusion-free deep mutual learning strategy that aligns modality-specific representations from lyrics, audio, and visual data while maintaining robustness against missing modalities. A heterogeneous graph structure is constructed to capture both user-song interactions and user-user social relationships, enabling the integration of individual preferences with social influence. Furthermore, emotion-aware embeddings derived from acoustic and textual signals contribute to emotionally aligned recommendations. Experimental evaluations on benchmark datasets demonstrate that MM-GNN significantly outperforms existing state-of-the-art methods across various performance metrics. Ablation studies further validate the critical impact of each model component, confirming the effectiveness of the framework in delivering accurate and socially contextualized music recommendations.


Enhancing Graph Neural Networks: A Mutual Learning Approach

arXiv.org Artificial Intelligence

Knowledge distillation (KD) techniques have emerged as a powerful tool for transferring expertise from complex teacher models to lightweight student models, particularly beneficial for deploying high-performance models in resource-constrained devices. This approach has been successfully applied to graph neural networks (GNNs), harnessing their expressive capabilities to generate node embeddings that capture structural and feature-related information. In this study, we depart from the conventional KD approach by exploring the potential of collaborative learning among GNNs. In the absence of a pre-trained teacher model, we show that relatively simple and shallow GNN architectures can synergetically learn efficient models capable of performing better during inference, particularly in tackling multiple tasks. We propose a collaborative learning framework where ensembles of student GNNs mutually teach each other throughout the training process. We introduce an adaptive logit weighting unit to facilitate efficient knowledge exchange among models and an entropy enhancement technique to improve mutual learning. These components dynamically empower the models to adapt their learning strategies during training, optimizing their performance for downstream tasks. Extensive experiments conducted on three datasets each for node and graph classification demonstrate the effectiveness of our approach.



Model and Feature Diversity for Bayesian Neural Networks in Mutual Learning Supplementary Material

Neural Information Processing Systems

We also test the direct maximization of Kullback-Leibler (KL) divergence between feature distributions. As presented in Table A.1, the direct maximization of Direct maximize KL divergence between feature distributions. We further conduct ablation studies focusing on directly maximizing the Kullback-Leibler (KL) divergence between feature distributions of peer Bayesian neural networks (as in setting d in Table A.1). Table A.2, the results for both ResNet20 and ResNet32 BNN models demonstrate that using optimal "*" means Bayesian neural networks that are initialized with the mean value from the pre-trained The results are shown in Table A.3. Figure A.1: Comparison of optimal transport distance between the parameter distributions of peer A.1, it is clear that our proposed method, which promotes A.2, it is clear that our proposed method, which promotes diversity in the feature



NT-ML: Backdoor Defense via Non-target Label Training and Mutual Learning

arXiv.org Artificial Intelligence

Recent studies have shown that deep neural networks (DNNs) are vulnerable to backdoor attacks, where a designed trigger is injected into the dataset, causing erroneous predictions when activated. In this paper, we propose a novel defense mechanism, Non-target label Training and Mutual Learning (NT-ML), which can successfully restore the poisoned model under advanced backdoor attacks. NT aims to reduce the harm of poisoned data by retraining the model with the outputs of the standard training. At this stage, a teacher model with high accuracy on clean data and a student model with higher confidence in correct prediction on poisoned data are obtained. Then, the teacher and student can learn the strengths from each other through ML to obtain a purified student model. Extensive experiments show that NT-ML can effectively defend against 6 backdoor attacks with a small number of clean samples, and outperforms 5 state-of-the-art backdoor defenses.